Smart Paper Notes

Learning List-Level Domain-Invariant Representations for Ranking

Ruicheng Xian, Honglei Zhuang, Zhen Qin, Hamed Zamani, Jing Lu, Ji Ma, Kai Hui, Han Zhao, Xuanhui Wang, Michael Bendersky

Categories: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-12-21

Domain adaptation aims to transfer the knowledge acquired by models trained on (data-rich) source domains to (low-resource) target domains, for which a popular method is invariant representation learning. While invariant representations have been studied extensively for classification and regression problems, how they apply to ranking problems, where the data and metrics have a list structure, is not well understood. Theoretically, we establish a domain adaptation generalization bound for ranking under listwise metrics such as MRR and NDCG. The bound suggests an adaptation method via learning list-level domain-invariant feature representations, whose benefits are empirically demonstrated by unsupervised domain adaptation experiments on real-world ranking tasks, including passage reranking. A key message is that for domain adaptation, the representations should be analyzed at the same level at which the metric is computed, as we show that learning invariant representations at the list level is most effective for adaptation on ranking problems.
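As an illustration of the listwise metrics named in the abstract above, the following is a minimal sketch of how MRR and NDCG are commonly computed per ranked list and then averaged over lists. The function names and the exponential gain (2^rel - 1) are assumptions chosen for this example, not details taken from the paper.

```python
import math


def reciprocal_rank(relevances):
    """Reciprocal rank of the first relevant item in one ranked list (0.0 if none)."""
    for position, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / position
    return 0.0


def dcg(relevances):
    """Discounted cumulative gain with exponential gain (an assumed convention)."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 1)
        for position, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG of the list divided by the DCG of its ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mean_metric(metric, ranked_lists):
    """Average a per-list metric over all lists; with reciprocal_rank this is MRR."""
    return sum(metric(r) for r in ranked_lists) / len(ranked_lists)
```

Each input list holds the relevance labels of one query's items in the order the model ranked them, so both metrics are defined on whole lists rather than on individual items.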

translated by Google Translate

HYRR: Hybrid Infused Reranking for Passage Retrieval

Jing Lu, Keith Hall, Ji Ma, Jianmo Ni

Categories: Natural Language Processing

2022-12-20

We present Hybrid Infused Reranking for Passage Retrieval (HYRR), a framework for training rerankers based on a hybrid of BM25 and neural retrieval models. Retrievers based on hybrid models have been shown to outperform both BM25 and neural models alone. Our approach exploits this improved performance when training a reranker, leading to a robust reranking model. The reranker, a cross-attention neural model, is shown to be robust to different first-stage retrieval systems, achieving better performance than rerankers trained only on the first-stage retrievers of the multi-stage systems. We present evaluations on a supervised passage retrieval task using MS MARCO and on zero-shot retrieval tasks using BEIR. The empirical results show strong performance on both evaluations.


EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks

Liyu Shi, Xiaoyan Li, Weiming Hua, Haoyuan Chen, Jing Chen, Zizhen Fan, Minghe Gao, Yujie Jing, Guotao Lu, Deguo Ma

Categories: Computer Vision

2022-12-01

Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers assessment accuracy when computer technology is used to aid diagnosis. Methods: This study provides a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, experimental results on the dataset are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results show that deep learning methods achieve better image segmentation performance on EBHI-Seg. The maximum Dice score for the classical machine learning method is 0.948, while the Dice score for the deep learning method is 0.965. Conclusion: This publicly available dataset contains 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can support researchers in developing new segmentation algorithms for medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.
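For readers unfamiliar with the Dice metric reported above, a minimal sketch for two binary masks follows. The function name, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not the evaluation code used for EBHI-Seg.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |P and T| / (|P| + |T|) for two binary masks of equal shape.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    intersection = 0
    pred_total = 0
    true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            pred_total += p
            true_total += t
            intersection += p * t  # counts pixels set in both masks
    denominator = pred_total + true_total
    if denominator == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / denominator
```

A score of 1.0 means the predicted and ground truth regions coincide exactly, and 0.0 means they do not overlap at all.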


Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles

Shuquan Ye, Yujia Xie, Dongdong Chen, Yichong Xu, Lu Yuan, Chenguang Zhu, Jing Liao

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-11-29

This paper focuses on analyzing and improving the commonsense ability of recent popular vision-language (VL) models. Despite their great success, we observe that existing VL-models still lack commonsense knowledge/reasoning ability (e.g., "Lemons are sour"), which is a vital component towards artificial general intelligence. Through our analysis, we find that one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective. Rather than collecting a new VL training dataset, we propose a more scalable strategy, i.e., "Data Augmentation with kNowledge graph linearization for CommonsensE capability" (DANCE). It can be viewed as a type of data augmentation technique that injects commonsense knowledge into existing VL datasets on the fly during training. More specifically, we leverage a commonsense knowledge graph (e.g., ConceptNet) and create variants of the text descriptions in VL datasets via bidirectional sub-graph sequentialization. For better commonsense evaluation, we further propose the first retrieval-based commonsense diagnostic benchmark. By conducting extensive experiments on some representative VL-models, we demonstrate that our DANCE technique is able to significantly improve the commonsense ability while maintaining the performance on vanilla retrieval tasks. The code and data are available at https://github.com/pleaseconnectwifi/DANCE


Towards Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline

Lichen Zhao, Daigang Cai, Jing Zhang, Lu Sheng, Dong Xu, Rui Zheng, Yinjie Zhao, Lipeng Wang, Xibo Fan

Categories: Computer Vision

2022-09-24

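Recently, 3D vision-and-language tasks have attracted increasing research interest. Compared with other vision-and-language tasks, the 3D visual question answering (VQA) task is less explored and more susceptible to language priors and co-reference ambiguity. Meanwhile, several recently proposed 3D VQA datasets do not support the 3D VQA task well because of their limited scale and annotation methods. In this work, we formally define and address a 3D grounded VQA task by collecting a new 3D VQA dataset, referred to as FE-3DGQA, with diverse and relatively free-form question-answer pairs, as well as dense and completely grounded bounding box annotations. To obtain more explainable answers, we label the objects that appear in the complex QA pairs with different semantic types, including answer-grounded objects (both those that appear in the questions and those that do not) and contextual objects for the answer-grounded objects. We also propose a new 3D VQA framework to effectively predict completely visually grounded and explainable answers. Extensive experiments verify that our newly collected benchmark dataset can be effectively used to evaluate various 3D VQA methods from different aspects, and that our newly proposed framework achieves state-of-the-art performance on the new benchmark dataset. Both the newly collected dataset and our code will be publicly available at http://github.com/zlccccc/3dgqa.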


Promptagator: Few-shot Dense Retrieval From 8 Examples

Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang

Categories: Natural Language Processing

2022-09-23

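Much recent research on information retrieval has focused on how to transfer from one task (typically with abundant supervised data) to various other tasks where supervision is limited, with the implicit assumption that it is possible to generalize from one task to all the rest. However, this overlooks the fact that there are many diverse and unique retrieval tasks, each targeting different search intents, queries, and search domains. In this paper, we suggest working on few-shot dense retrieval, a setting where each task comes with a short description and a few examples. To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLMs) as few-shot query generators and creates task-specific retrievers based on the generated data. Powered by the generalization ability of LLMs, Promptagator makes it possible to create task-specific end-to-end retrievers based solely on a few examples, without using Natural Questions or MS MARCO to train question generators or dual encoders. Surprisingly, LLM prompting with no more than 8 examples allows dual encoders to outperform heavily engineered models trained on MS MARCO, such as ColBERT v2, by more than 1.2 NDCG on average across 11 retrieval sets. Further training standard-size re-rankers on the same generated data yields another 5.0-point NDCG improvement. Our studies show that query generation can be far more effective than previously observed, especially when a small amount of task-specific knowledge is given.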


Recurrence-free Survival Prediction under the Guidance of Automatic Gross Tumor Volume Segmentation for Head and Neck Cancers

Kai Wang, Yunxiang Li, Michael Dohopolski, Tao Peng, Weiguo Lu, You Zhang, Jing Wang

Categories: Computer Vision | Machine Learning

2022-09-22

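For head and neck cancer (HNC) patient management, automatic gross tumor volume (GTV) segmentation and accurate pre-treatment cancer recurrence prediction are of great importance in assisting physicians to design personalized management plans, which have the potential to improve the treatment outcome and quality of life of HNC patients. In this paper, we develop an automatic primary tumor (GTVp) and lymph node (GTVn) segmentation method based on combined pre-treatment positron emission tomography/computed tomography (PET/CT) scans of HNC patients. We extract radiomics features from the segmented tumor volumes and construct a multi-modality tumor recurrence-free survival (RFS) prediction model, which fuses the predictions of separate CT radiomics, PET radiomics, and clinical models. We perform 5-fold cross-validation to train and evaluate our methods on the MICCAI 2022 HEad and neCK TumOR segmentation and outcome prediction challenge (HECKTOR) dataset. On the testing cohort, the ensemble predictions achieve Dice scores of 0.77 and 0.73 for GTVp and GTVn segmentation, respectively, and a C-index of 0.67 for RFS prediction. The code is publicly available (https://github.com/wangkaiwan/hecktor-2022-airt). Our team's name is AIRT.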
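The C-index reported above measures how well predicted risks order patients by time to recurrence. A minimal sketch of Harrell's concordance index follows; the function name and the convention that a higher risk score means earlier recurrence are assumptions, and the challenge's official scoring may differ in details such as tie handling.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : observed follow-up time per patient
    events : 1 if recurrence was observed, 0 if the patient was censored
    risks  : predicted risk score (assumed: higher means earlier recurrence)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ordering of the comparable pairs.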


Federated Learning from Pre-Trained Models: A Contrastive Learning Approach

Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang

Categories: Artificial Intelligence | Machine Learning

2022-09-21

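Federated learning (FL) is a machine learning paradigm that allows decentralized clients to learn collaboratively without sharing their private data. However, excessive computation and communication demands pose challenges to current FL frameworks, especially when training large-scale models. To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework in which clients jointly learn to fuse the representations generated by multiple fixed pre-trained models rather than training a large-scale model from scratch. This leads us to a more practical FL problem: how to capture more client-specific information from the pre-trained models and jointly improve each client's ability to exploit those off-the-shelf models. In this work, we design a Federated Prototype-wise Contrastive Learning (FedPCL) approach, which shares knowledge across clients through their class prototypes and builds client-specific representations in a prototype-wise contrastive manner. Sharing prototypes rather than learnable model parameters allows each client to fuse the representations in a personalized way while keeping the shared knowledge in a compact form for efficient communication. We perform a thorough evaluation of the proposed FedPCL in the lightweight framework, measuring and visualizing its ability to fuse various pre-trained models on popular FL datasets.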
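The idea of sharing class prototypes instead of model parameters can be sketched as follows. This assumes that a prototype is the mean embedding of a class on one client and that prototypes are combined by a plain average across clients; both function names are hypothetical, and the contrastive training step is not shown.

```python
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Mean embedding per class on one client (assumed prototype definition)."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for k, value in enumerate(vec):
            sums[label][k] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def average_prototypes(client_prototypes):
    """Average each class prototype over the clients that hold that class."""
    gathered = defaultdict(list)
    for prototypes in client_prototypes:
        for label, vec in prototypes.items():
            gathered[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in gathered.items()
    }
```

Under these assumptions, each client communicates one vector per class rather than a full set of model weights, which is what keeps the shared knowledge compact.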


Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

Ziming Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu

Categories: Computer Vision

2022-08-31

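Point cloud registration aims to estimate the geometric transformation between two point cloud scans, for which point-wise correspondence estimation is the key to success. In addition to earlier methods that seek correspondences through hand-crafted or learned geometric features, recent point cloud registration methods have tried to use RGB-D data to obtain more accurate correspondences. However, effectively fusing the geometric and visual information from these two distinct modalities is not trivial, especially for the registration problem. In this work, we propose a new Geometry-Aware Visual Feature Extractor (GAVE) that employs multi-scale local linear transformation to progressively fuse the two modalities, where the geometric features from the depth data act as geometry-dependent convolution kernels that transform the visual features from the RGB data. The resulting visual-geometric features lie in canonical feature spaces in which the visual dissimilarity caused by geometric changes is alleviated, so more reliable correspondences can be achieved. The proposed GAVE module can be readily plugged into recent RGB-D point cloud registration frameworks. Extensive experiments on 3D Match and ScanNet demonstrate that our method outperforms state-of-the-art point cloud registration methods even without correspondence or pose supervision. The code is available at: https://github.com/514DNA/llt.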


A deep learning framework for geodesics under spherical Wasserstein-Fisher-Rao metric and its application for weighted sample generation

Yang Jing, Jiaheng Chen, Lei Li, Jianfeng Lu

Categories: Machine Learning

2022-08-25

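The Wasserstein-Fisher-Rao (WFR) distance is a family of metrics for gauging the discrepancy between two Radon measures, taking into account both transportation and weight change. The spherical WFR distance is a projected version of the WFR distance for probability measures, so that the space of Radon measures equipped with WFR can be viewed as a metric cone over the space of probability measures equipped with spherical WFR. Compared with the case of the Wasserstein distance, the understanding of geodesics under spherical WFR is less clear and remains an ongoing research focus. In this paper, we develop a deep learning framework to compute geodesics under the spherical WFR metric, and the learned geodesics can be used to generate weighted samples. Our approach is based on a Benamou-Brenier type dynamic formulation of spherical WFR. To overcome the difficulty of enforcing the boundary constraint brought by the weight change, a Kullback-Leibler (KL) divergence term based on the inverse map is introduced into the cost function. In addition, a new regularization term using the particle velocity is introduced as a substitute for the Hamilton-Jacobi equation for the potential in the dynamic formulation. When used for sample generation, our framework can be beneficial for applications with given weighted samples, compared with sample generation using previous flow models.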
